 binary number


SPRINT: Enabling Interleaved Planning and Parallelized Execution in Reasoning Models

arXiv.org Artificial Intelligence

Large reasoning models (LRMs) excel at complex reasoning tasks but typically generate lengthy sequential chains-of-thought, resulting in long inference times before arriving at the final answer. To address this challenge, we introduce SPRINT, a novel post-training and inference-time framework designed to enable LRMs to dynamically identify and exploit opportunities for parallelization during their reasoning process. SPRINT incorporates an innovative data curation pipeline that reorganizes natural language reasoning trajectories into structured rounds of long-horizon planning and parallel execution. By fine-tuning LRMs on a small amount of such curated data, the models learn to dynamically identify independent subtasks within extended reasoning processes and effectively execute them in parallel. Through extensive evaluations, we demonstrate that models fine-tuned with the SPRINT framework match the performance of reasoning models on complex domains such as mathematics while generating up to 39% fewer sequential tokens on problems requiring more than 8,000 output tokens. Finally, we observe consistent results transferred to two out-of-distribution tasks, namely GPQA and Countdown, with up to 45% and 65% reduction in average sequential tokens respectively for longer reasoning trajectories, while matching the performance of the fine-tuned reasoning model.


Extrapolative ML Models for Copolymers

arXiv.org Artificial Intelligence

Machine learning models have been progressively used for predicting materials properties. These models can be built using pre-existing data and are useful for rapidly screening the physicochemical space of a material, which is astronomically large. However, ML models are inherently interpolative, and their efficacy for searching candidates outside a material's known range of property is unresolved. Moreover, the performance of an ML model is intricately connected to its learning strategy and the volume of training data. Here, we determine the relationship between the extrapolation ability of an ML model, the size and range of its training dataset, and its learning approach. We focus on a canonical problem of predicting the properties of a copolymer as a function of the sequence of its monomers. Tree search algorithms, which learn the similarity between polymer structures, are found to be inefficient for extrapolation. Conversely, the extrapolation capability of neural networks and XGBoost models, which attempt to learn the underlying functional correlation between the structure and property of polymers, shows strong correlations with the volume and range of training data. These findings have important implications for ML-based new material development.


Bits, Bytes, and Binary

#artificialintelligence

Ever wondered what they really are – bits, bytes and binary? But have you ever wondered how these massive amounts of data are actually stored? Welcome to computers at their core: bits. Every piece of information, in most current computing systems, whether it be your desktop PC, your mobile or the intelligent screen on your smart fridge, is stored by means of 'bits'. The single most granular piece of information a computer can "understand" and process is a bit.
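As a small illustration of the idea that everything is stored as bits, the sketch below converts integers and text to their bit patterns and back. The helper names (`to_bits`, `from_bits`, `text_to_bits`) are hypothetical and chosen for this example only; it assumes 8-bit bytes and UTF-8 text encoding.

```python
def to_bits(value: int, width: int = 8) -> str:
    """Return the binary digits of a non-negative integer, zero-padded to `width`."""
    if value < 0 or value >= (1 << width):
        raise ValueError(f"{value} does not fit in {width} bits")
    return format(value, f"0{width}b")


def from_bits(bits: str) -> int:
    """Interpret a string of '0'/'1' characters as an unsigned binary number."""
    return int(bits, 2)


def text_to_bits(text: str) -> list[str]:
    """Encode text as UTF-8 (an assumption) and show each byte as 8 bits."""
    return [to_bits(byte) for byte in text.encode("utf-8")]


if __name__ == "__main__":
    print(to_bits(5))          # one byte holding the number 5
    print(from_bits("00000101"))
    print(text_to_bits("Hi"))  # two bytes, one per character
```

Each byte here is a group of eight bits, so a single byte can hold 256 distinct values (0 to 255).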


A Lower Bound on DNNF Encodings of Pseudo-Boolean Constraints

arXiv.org Artificial Intelligence

Two major considerations when encoding pseudo-Boolean (PB) constraints into SAT are the size of the encoding and its propagation strength, that is, the guarantee that it has a good behaviour under unit propagation. Several encodings with propagation strength guarantees rely upon prior compilation of the constraints into DNNF (decomposable negation normal form), BDD (binary decision diagram), or some other sub-variants. However it has been shown that there exist PB-constraints whose ordered BDD (OBDD) representations, and thus the inferred CNF encodings, all have exponential size. Since DNNFs are more succinct than OBDDs, preferring encodings via DNNF to avoid size explosion seems a legitimate choice. Yet in this paper, we prove the existence of PB-constraints whose DNNFs all require exponential size.


A Layered Learning Approach to Scaling in Learning Classifier Systems for Boolean Problems

arXiv.org Artificial Intelligence

Learning classifier systems (LCSs) originated from cognitive-science research but migrated such that LCS became powerful classification techniques. Modern LCSs can be used to extract building blocks of knowledge to solve more difficult problems in the same or a related domain. Recent works on LCSs showed that the knowledge reuse through the adoption of Code Fragments, GP-like tree-based programs, into LCSs could provide advances in scaling. However, since solving hard problems often requires constructing high-level building blocks, which also results in an intractable search space, a limit of scaling will eventually be reached. Inspired by human problem-solving abilities, XCSCF* can reuse learned knowledge and learned functionality to scale to complex problems by transferring them from simpler problems using layered learning. However, this method was unrefined and suited to only the Multiplexer problem domain. In this paper, we propose improvements to XCSCF* to enable it to be robust across multiple problem domains. This is demonstrated on the benchmarks Multiplexer, Carry-one, Majority-on, and Even-parity domains. The required base axioms necessary for learning are proposed, methods for transfer learning in LCSs developed and learning recast as a decomposition into a series of subordinate problems. Results show that from a conventional tabula rasa, with only a vague notion of what subordinate problems might be relevant, it is possible to capture the general logic behind the tested domains, so the advanced system is capable of solving any individual n-bit Multiplexer, n-bit Carry-one, n-bit Majority-on, or n-bit Even-parity problem.
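For readers unfamiliar with the four benchmark domains named in the abstract, the following sketch gives one common formulation of each as a plain Boolean function. These definitions are assumptions based on how the problems are usually stated (for example, a multiplexer with k address bits followed by 2^k data bits); the paper's exact conventions, such as bit ordering or tie handling, may differ. Nothing here reproduces XCSCF* itself.

```python
def multiplexer(bits: list[int]) -> int:
    """n-bit multiplexer with n = k + 2**k: the first k bits address one data bit."""
    k = 0
    while k + (1 << k) < len(bits):
        k += 1
    if k + (1 << k) != len(bits):
        raise ValueError("length must be k + 2**k for some k")
    # Read the address bits as an unsigned binary number (assumed MSB first).
    address = int("".join(map(str, bits[:k])), 2) if k else 0
    return bits[k + address]


def even_parity(bits: list[int]) -> int:
    """Return 1 when the number of ones is even."""
    return int(sum(bits) % 2 == 0)


def majority_on(bits: list[int]) -> int:
    """Return 1 when strictly more than half of the bits are one (assumed tie rule)."""
    return int(sum(bits) > len(bits) / 2)


def carry_one(bits: list[int]) -> int:
    """Split the input into two equal-length binary numbers; 1 if their sum carries out."""
    if len(bits) % 2:
        raise ValueError("length must be even")
    half = len(bits) // 2
    a = int("".join(map(str, bits[:half])), 2)
    b = int("".join(map(str, bits[half:])), 2)
    return int(a + b >= (1 << half))


if __name__ == "__main__":
    print(multiplexer([1, 0, 0, 0, 1, 0]))  # address 10 selects the third data bit
    print(even_parity([1, 1, 0, 0]))
    print(majority_on([1, 1, 0]))
    print(carry_one([1, 1, 0, 1]))          # 11 + 01 overflows two bits
```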


Learning Numeracy: Binary Arithmetic with Neural Turing Machines

arXiv.org Artificial Intelligence

Computer programs are composed of three fundamental mechanisms: elementary operations, logical flow control and memory usage. In the history of neural networks [19] only the use of elementary operations has been extensively explored so far, but during the last few years the coupling with an external piece of memory has been gaining popularity [24]. Neural Turing Machines (NTMs) were developed in 2014 at Google DeepMind Labs [8] in an attempt to couple a neural network with an external memory component in order to improve long-term dependency learning in sequence prediction. Although recurrent neural networks (RNNs) are Turing-complete on their own [20], the difficulties that arise during their training (like the vanishing or the exploding gradient problems [18, 15]) prevented them from being employed in learning more complex tasks, for example algorithmic ones [27]. NTMs derive their name from the analogy with standard Turing Machines (TMs) [22] in addressing an infinite (or at least large enough to be considered so) portion of memory with an attentional mechanism similar to the read/write head of a TM. In contrast to a standard TM, a NTM is a "differentiable computer" that can be trained using gradient descent methods and can therefore learn its own "program" independently (attempts using Neuroevolution [9] and reinforcement learning [26] have also been made). In human brains, the most similar process to an algorithm is the concept of "working memory" [1]: this mechanism allows the brain to rapidly create "variables" [11] by storing short-term information and manipulating it in a rule-based way [17]. The analogy with an algorithm is evident, and a NTM is similar to this process because it can learn tasks in which it is required to manipulate rapidly-created variables. 
Also the attention mechanism in a NTM is similar to the way the working memory bounds its information in certain slots of memory in the brain [6], despite the fact that a NTM autonomously learns how to do that.


Dewey -- The First Artificial Intelligence Novelist – Alvaro Videla – Medium

#artificialintelligence

There have been many kinds of books, with many kinds of meanings. This one book was special because it was the first fictional story produced via artificial intelligence. It was the first book in the sense that its contents made sense. Before this book, all other attempts at letting an AI write a book had produced things that were pastiches of randomness. A couple of sentences here and there surrounded by text that made no sense.


Exploiting Single-Cycle Symmetries in Continuous Constraint Problems

arXiv.org Artificial Intelligence

Symmetries in discrete constraint satisfaction problems have been explored and exploited in the last years, but symmetries in continuous constraint problems have not received the same attention. Here we focus on permutations of the variables consisting of one single cycle. We propose a procedure that takes advantage of these symmetries by interacting with a continuous constraint solver without interfering with it. A key concept in this procedure are the classes of symmetric boxes formed by bisecting a n-dimensional cube at the same point in all dimensions at the same time. We analyze these classes and quantify them as a function of the cube dimensionality. Moreover, we propose a simple algorithm to generate the representatives of all these classes for any number of variables at very high rates. A problem example from the chemical field and the cyclic n-roots problem are used to show the performance of the approach in practice.
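One way to picture the classes of symmetric boxes is sketched below. It rests on assumptions that go beyond the abstract: each of the 2^n sub-boxes is labelled by an n-bit binary string (bit i says whether the box lies in the lower or upper half along dimension i), and the single-cycle permutation of the variables acts on these labels as a cyclic rotation. Under those assumptions the classes are the orbits of binary strings under rotation. This is a brute-force illustration with hypothetical function names, not the generation algorithm proposed in the paper.

```python
from math import gcd


def canonical(bits: str) -> str:
    """Pick the lexicographically smallest rotation as the class representative."""
    return min(bits[i:] + bits[:i] for i in range(len(bits)))


def class_representatives(n: int) -> list[str]:
    """Enumerate all 2**n box labels and keep one representative per rotation class."""
    return sorted({canonical(format(v, f"0{n}b")) for v in range(1 << n)})


def count_classes(n: int) -> int:
    """Count rotation classes with Burnside's lemma: (1/n) * sum of 2**gcd(i, n)."""
    return sum(1 << gcd(i, n) for i in range(n)) // n


if __name__ == "__main__":
    for n in range(1, 7):
        reps = class_representatives(n)
        # The brute-force enumeration and the closed-form count should agree.
        print(n, len(reps), count_classes(n))
```

The enumeration is exponential in n and only meant for small cases; the closed-form count shows how the number of classes grows with the cube dimensionality.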


Exploiting Single-Cycle Symmetries in Continuous Constraint Problems

Journal of Artificial Intelligence Research

Symmetries in discrete constraint satisfaction problems have been explored and exploited in the last years, but symmetries in continuous constraint problems have not received the same attention. Here we focus on permutations of the variables consisting of one single cycle. We propose a procedure that takes advantage of these symmetries by interacting with a continuous constraint solver without interfering with it. A key concept in this procedure are the classes of symmetric boxes formed by bisecting a n-dimensional cube at the same point in all dimensions at the same time. We analyze these classes and quantify them as a function of the cube dimensionality. Moreover, we propose a simple algorithm to generate the representatives of all these classes for any number of variables at very high rates. A problem example from the chemical field and the cyclic n-roots problem are used to show the performance of the approach in practice.